This study leverages naturally occurring lotteries for oversubscribed Boston Public Schools prekindergarten
program sites between 2007 and 2011, for 3,182 children (M = 4.5 years old), to estimate the impacts of
winning a first choice lottery and enrolling in Boston prekindergarten versus losing a first choice lottery and not
enrolling on children’s enrollment and persistence in district schools, grade retention, special education
placement, and third-grade test scores. There are large effects on enrollment and persistence, but no effects on other
examined outcomes for this subsample. Importantly, children who competed for oversubscribed seats were
not representative of all appliers and almost all control-group children attended center-based preschool.
Findings contribute to the larger evidence base and raise important considerations for future prekindergarten
lottery-based studies.


Decades of research have shown that attending pre- 
school improves children’s cognitive and socio-emo- 
tional skills at kindergarten entry (Duncan & 
Magnuson, 2013). This evidence, along with dra- 
matic increases in maternal employment, has 
helped to fuel currently high levels of public sup- 
port and parental demand for public preschool. Via 
a combination of public dollars and parental spend- 
ing, attending preschool is now the typical experi- 
ence for U.S. 4-year-olds (Chaudry, Morrissey, 
Weiland, & Yoshikawa, 2017). In all, 69% of 4-year- 
olds attend some form of center-based preschool in 
the year before they enter kindergarten, though
children from the top income quintile are much 
more likely to attend preschool than those in the 
bottom quintile (83% vs. 50%, respectively; White- 
hurst & Klein, 2015). Approximately 43% of 4-year- 
olds access preschool through public funding, via 
state or local prekindergarten programs or Head 
Start (Barnett et al., 2017). 

While the evidence is nearly incontrovertible that 
children who attend preschool enter kindergarten 
better ready to learn (Phillips et al., 2017; Yoshi- 
kawa et al., 2013), questions about how long the 
benefits of preschool persist are long-standing, dat- 
ing back to the first major public investment in pre- 
school in the United States—Head Start in the 
1960s (Cicirelli, 1969). The overall pattern in the 
older literature is that the language, literacy, and 
mathematics test scores of preschool participants 
and nonparticipants tend to converge in the early 
elementary grades (i.e., by around third grade),
sometimes partially and sometimes fully (Phillips 
et al., 2017; Yoshikawa et al., 2013). But in the 
studies examining long-run effects, preschool participants
tend to outperform nonparticipants on a
wide range of behavioral, health, and educational 
outcomes into adulthood. Evidence from modern- 
day, scaled-up programs so far largely mirrors this 
medium-term pattern, though a group of experts 
recently concluded that such evidence “is sparse, 
precluding broad conclusions” (Phillips et al., 2017). 
Furthermore, long-run evidence is not yet available 
for modern-day, large-scale programs. 

In this study, we address current needs in the lit- 
erature using lotteries for oversubscribed program 
sites as a window into the medium-term effects of 
the Boston Public Schools (BPS) prekindergarten 
program. Specifically, using data from four cohorts 
of students, we examine whether children who won 
their first choice lottery and enrolled in Boston 
prekindergarten benefit more than children who 
lost their first choice lottery and ultimately did not 
enroll in BPS prekindergarten. Our outcomes are 
drawn from administrative records and include 
third-grade state-standardized reading and mathe- 
matics test scores, K-2 grade retention, and K-3 spe- 
cial education placement. We also examine whether 
prekindergarten leads children to enroll and persist 
in the BPS at higher rates, as one of the program’s 
goals was to attract and retain families that might 
otherwise not have enrolled in BPS schools. Finally, 
to contextualize the findings, we descriptively 
examine children’s post-prekindergarten schooling 
environments. 

Importantly, given both calls in the field for 
more rigorous longitudinal studies of prekinder- 
garten (Phillips et al., 2017) and increasing attention 
to external validity (Stuart, Bradshaw, & Leaf, 2015; 
Tipton, 2014), the effects of prekindergarten enrollment
that we estimate apply to the subgroup of lottery
compliers, that is, children who won or lost
their first choice lottery and either enrolled in the 
program (first choice lottery winners) or did not 
enroll in the program at all (first choice lottery 
losers). As we describe, our lotteries were highly 
concentrated in a small subset of BPS schools 
(e.g., 75% of lottery applicants competed for about 
a quarter of eligible district schools) and the chil- 
dren who competed for oversubscribed seats were 
more advantaged than the average applicant. Virtu- 
ally all the control group members attended other 
center-based preschool programs, an unusual coun- 
terfactual in the public prekindergarten evaluation 
literature. To assess external validity, we followed 
the example of Abdulkadiroglu, Angrist, Dynarski, 
Kane, and Pathak’s (2011) seminal lottery-based 
study and used data on the full set of program
applicants and enrollees to examine the generalizability
of our results through descriptive and
quasi-experimental analyses. These analyses are 
important given recent attention to how effects for 
compliers may not represent a generalizable test of 
the effects of a program on all members of a target 
population (Chyn, 2018). 


When Might Prekindergarten Benefits Persist? 


Multiple theoretical frames are relevant to exam- 
ining whether and when attending prekindergarten 
might boost children’s medium-term academic and 
school progress outcomes. First, the human capital 
accumulation theory from economics posits that a 
strong early foundation sets the stage for acquiring 
more advanced skills. Heckman (2000) referred to 
this perspective as “learning begets learning, skill 
begets skill.” Second, developmental cascades the- 
ory, which has its origins in the field of develop- 
mental psychology, describes the processes by 
which antecedent conditions have different probabilities
of leading to particular outcomes; functioning
at a particular level or in a particular
developmental domain is hypothesized to affect 
later competencies in multiple domains (Masten & 
Cicchetti, 2010). A third theoretical perspective for 
expecting persistence is based on transactional
developmental theory (Sameroff, 2009)—that is, the 
reciprocal effects of child skills and environmental 
inputs on subsequent teacher behaviors, and effects 
of such behaviors in turn on students. Following 
this theory, the prekindergarten boost may persist 
because participants’ later-grade teachers may 
respond to their students’ increased skill level by
increasing the instructional opportunities that they 
offer students. There also may be observer-expectancy
effects in which teachers may either
consciously or subconsciously behave in ways that 
facilitate students’ progress in accordance with their 
own expectations of the students (Weinstein, 2004). 

Most recently, Bailey et al. (2017) built on these 
theories and offered three hypotheses for the persis- 
tence (or not) of a preschool boost. First, their “sus- 
taining environments” hypothesis posits that the 
quality (broadly defined) of children’s educational 
settings after preschool is critical in sustaining the 
preschool boost. As an example, repeating the same 
content in kindergarten as in preschool would not 
be a sustaining environment for the preschool 
boost. Having a high percentage of peers who are 
well prepared for kindergarten might spark their 
teacher to increase rigor and therefore sustain the 
boost. Second, their “foot-in-the-door” hypothesis
posits that attending preschool may get children
over an important hurdle in their K-plus experi- 
ences and thereby grant them access to a benefit or 
allow them to avoid a harm (e.g., unwarranted spe- 
cial education placement). They also hypothesized 
that another key to convergence of outcomes of 
attenders and nonattenders could be which skills 
are emphasized and measured in the prekindergarten
through third-grade period. They argue that for a
boost to last, the focal skills must be malleable,
fundamental for success, and unlikely to develop in
the counterfactual. The boost from a prekindergarten
program that focuses on constrained skills
(Snow & Matthews, 2016)—for example, the dis- 
crete set of basic literacy and mathematics skills 
that almost all children master by third grade such 
as letter knowledge and simple counting—is likely 
to be less enduring than the boost from a program 
that focuses on students’ deeper unconstrained 
skills, meaning more broadband skills like world 
knowledge, vocabulary, conceptual thinking, and 
problem solving. 

At this juncture, it remains unclear which of 
these theories best describes patterns in the empiri- 
cal evidence base. Empirically, in the medium term, 
the older evidence has shown that preschool has 
small-to-moderate effects in reducing grade reten- 
tion and special education placement in the 
K-12 years (McCoy et al., 2017; Yoshikawa, Wei- 
land, & Brooks-Gunn, 2016). In both older and 
more recent studies, language, literacy, and mathe- 
matics test scores between preschool participants 
and nonparticipants tend to partially or fully con- 
verge by the end of third grade, though some stud- 
ies do show some evidence of medium-term 
persistence (e.g., Bassok, Gibbs, & Latham, 2018; 
Hill, Gormley, & Adelstein, 2015; Ladd, Muschkin, 
& Dodge, 2014; Lipsey, Farran, & Durkin, 2018; 
Phillips et al., 2017; Puma et al., 2012). Recent work 
on the trajectory of effects suggests most of the 
eventual medium-term convergence between preschool
attenders and nonattenders occurs within
1–2 years after preschool (Hojman, 2015). Specifically,
about half of the eventual convergence on cognitive
outcomes occurs during kindergarten, and the
remaining gap narrows by about half again by the
end of second grade (Li et al., 2016).

The relatively small number of studies that have
followed preschool participants into adulthood have
found long-term benefits such as increases in college 
enrollment, decreases in incarceration rates, and 
decreases in teen pregnancy, even when in the med- 
ium-term there is convergence in test scores (Deming, 
2009; Gibbs, Ludwig, & Miller, 2011; Yoshikawa et al., 
2016). However, the jury is still out on whether
today’s preschool programs will yield long-term bene- 
fits to participants and society similar to programs 
from earlier decades, particularly in settings in which 
participants show medium-term fadeout. By neces- 
sity, all the longer term evidence is from participants 
who attended preschool decades ago and there are 
important differences in context between older studies 
versus those of today’s preschools and preschoolers. 
Parents of all social classes today invest more time 
and money in their children’s learning, on average, 
than in previous generations (Bassok, Finch, Lee, 
Reardon, & Waldfogel, 2016; Reardon, 2011). Also, 
more children attend nonparental care than in the 
past, changing the counterfactual against which a 
given preschool program is evaluated (Chaudry et al., 
2017). Previous work suggests the counterfactual 
plays a substantial role in preschool evaluations. In a 
re-analysis of the Head Start Impact Study, Feller, 
Grindal, Miratrix, and Page (2016), for example, found 
persistence of positive effects on language through 
first grade only for children who in the absence of 
Head Start would have been at home with their par- 
ents and not for children who otherwise would have 
been enrolled in another preschool program. 

These more modern-day findings regarding the 
counterfactual are particularly relevant to this study 
because, as we detail further in our findings section, 
an unusually high percentage of our control group 
compliers (88%) attended a center-based preschool 
program other than the Boston program and only 
6% stayed home with a parent. As such, our study 
sits between two types of studies: (a) studies that 
compare a given preschool program against a more 
mixed counterfactual; and (b) studies in which all 
children attend the same preschool program but 
some attend an enhanced version. Examples of the 
former include the recent Tulsa quasi-experimental 
propensity score studies in which Tulsa prekinder- 
garten is compared to a counterfactual in which 
48% of children were in other center-based preschool
programs (Hill et al., 2015) and the Tennessee
VPK study in which 34% of the comparison
group was in other center-based preschool programs
(Lipsey et al., 2018). Examples of the latter
type of study include a recent preschool mathemat- 
ics curricula trial that followed children into first 
grade, with the treatment group receiving an 
enhanced preschool experience and the control children
receiving business-as-usual preschool (Jenkins et al.,
2018). 

Increasingly, the evolving prekindergarten context 
adds nuance to understanding the effects of today’s 
programs as well as raises new outcomes of interest. 
For example, one of our study’s key medium-term
outcomes—post-prekindergarten enrollment in the 
BPS—has not been a focus in the literature to date, 
though it increasingly is a focus of localities that 
administer such programs. The only relevant evi- 
dence we are aware of is a recent study that found 
that Tulsa prekindergarten alumni were somewhat 
more likely to persist in the Tulsa Public Schools than 
were non-Tulsa prekindergarten and non-Tulsa 
Head Start attenders through eighth grade (Gormley, 
Phillips, & Anderson, 2018). There are, however, on- 
the-ground reports that schools in both New Orleans 
and DC—in which parent choice is a central feature 
of school assignment—have chosen to offer 
prekindergarten as a strategy to attract and retain 
families (D. Ewen, personal communication, June 19, 
2017; Weixler, Lincove, & Gerry, 2017). In DC, school 
re-enrollment is also now a measure of school
success/progress under the Every Student Succeeds Act
(U.S. Department of Education, 2017). Seattle has 
framed its prekindergarten program not explicitly as 
a family retention strategy but as part of its afford- 
ability agenda (Slote & Kelly, 2015)—though pre- 
sumably greater affordability would allow more 
young families to stay in the city. We expect that 
both the increasingly competitive educational mar- 
kets and the rising cost of living in large cities will 
lead to increasing focus on the effects of public 
prekindergarten on K+ enrollment decisions. And 
we expect more broadly that the changing prekinder- 
garten landscape will lead to attention to other out- 
comes new to the literature. 


The Boston Prekindergarten Program and the Boston 
K-3 Context 


The Boston Public Prekindergarten program is a 
modern-day, relatively large-scale program that has 
been of interest in the recent literature because of 
its programmatic elements and its documented 
strong impacts on children’s school readiness. The 
program began an expansion in 2005-2006, under 
the decree of then-Mayor Thomas Menino who 
argued that in addition to preparing children for 
school, the program could help attract families to 
the BPS who might otherwise leave or choose other 
options. The program is based entirely in the public 
schools, pays teachers on the same scale as K-12 
teachers, and subjects teachers to the same educational
requirements as K-12 teachers (e.g., a master’s
degree within 5 years). Furthermore, it is open to 
any child in the city, regardless of income. In our 
study years, about one third of all 4-year-olds in 
Boston enrolled in the program and about half of
all children who enrolled in BPS kindergarten had
attended BPS prekindergarten the year before (Sha- 
piro, Martin, Weiland, & Unterman, 2019). 

Since 2007, the program also has used consistent
curricula and a coaching system. Specifically,
the district implemented Opening the World of Learn- 
ing, which targets children’s early language and lit- 
eracy skills and includes a social-skills component 
embedded in each unit, in which teachers discuss 
socioemotional issues with children and integrate 
emotion-related vocabulary words (Schickedanz & 
Dickinson, 2005). It also implemented Building 
Blocks, an early mathematics curriculum which cov- 
ers both numeracy and geometry and has a heavy 
focus on verbal mathematical reasoning (Clements 
& Sarama, 2007a). Both curricula have shown posi- 
tive effects on children’s outcomes in other studies 
(Ashe, Reed, Dickinson, Morse, & Wilson, 2009; 
Clements & Sarama, 2007b; Clements, Sarama, Spi- 
tler, Lange, & Wolfe, 2011), though the evidence 
base for Building Blocks is stronger than that for 
OWL (Weiland & Yoshikawa, 2013). 

In two of our four focal years (2007-2008 and 
2008-2009), curricula implementation was sup- 
ported via trainings and regular coaching, meaning 
weekly to biweekly on-site support from an experi- 
enced early childhood coach trained in both curric- 
ula. Thereafter, due to budget cuts, coaching was 
targeted to new teachers and to prekindergarten 
and kindergarten teachers in schools undergoing 
National Association for the Education of Young 
Children Accreditation, a quality assurance process 
used in early childhood settings nationally. On the 
whole, Boston’s structural and programmatic 
choices make it fairly unique among public programs
nationally, which tend not to require master’s
degrees, usually do not pay prekindergarten teach- 
ers on the same scale as K-12 teachers, do not 
require a proven, consistent curriculum, and do not 
employ coaching (Barnett et al., 2017). 

The quality of the Boston program has been 
investigated in prior work using standard class- 
room observational tools. Boston classrooms score 
similarly to other systems nationally on structural 
quality and on emotional support (Weiland, Ulves- 
tad, Sachs, & Yoshikawa, 2013). However, Boston 
has the highest average instructional quality of a 
large-scale program to date (Chaudry, Morrissey, 
Weiland, & Yoshikawa, 2017), scoring, for example, 
in the 2009-2010 school year 1.7 to 2.4 SDs higher 
on this dimension than current Head Start quality 
nationally (Weiland, 2016). It also showed strong 
effects on the language, literacy, mathematics, and 
executive function skills at kindergarten entry of
children who attended the program in 2008-2009 in
a large-scale regression discontinuity study that 
used the program’s long-standing September 1 cut- 
off as its source of exogeneity (Weiland & Yoshi- 
kawa, 2013). Importantly, the care settings for 
control group children were relatively stronger than 
has typically been the case in past such studies 
because Massachusetts has some of the strongest 
child care standards nationally and approximately 
two-thirds of control-group children were enrolled 
in nonparental care during the treatment year, with 
about 57% in other center-based preschool pro- 
grams (Weiland & Yoshikawa, 2013). 

Post-prekindergarten in our study’s focal years, 
district K-3 teachers implemented the literacy cur- 
riculum Reading Street and the mathematics curricu- 
lum TERC Investigations. These curricula do not 
have a strong evidence base compared to the pre-k 
curricula used in the district (Agodini, Harris, Tho- 
mas, Murphy, & Gallagher, 2010; Gatti & Petro- 
chenkov, 2010; Ladnier-Hicks, McNeese, & Johnson, 
2010; What Works Clearinghouse, 2013), nor were 
they supported by coaching and training as system- 
atically or as frequently as the pre-k program’s sup- 
ports. Reflective of these differing investment 
levels, classroom quality data collected by the 
Wellesley Centers for Women in spring 2012 on 84 
K-3 classrooms in BPS and in spring 2010 on 83 
prekindergarten classrooms and reanalyzed by our 
study team show that prekindergarten classroom 
instructional quality was markedly higher on aver- 
age than K-3 instructional quality (see Table 1 in 
Appendix S1). Notably, the district responded to
this evidence and other related evidence by subse- 
quently (not in our study focal years) developing 
its own K-2 curriculum and associated professional 
development program (Boston Public Schools, 
2017). 


Current Study 


Using data from four cohorts of students who 
applied to the BPS prekindergarten program 
between the 2007-2008 and the 2010-2011 school 
year, we aimed to investigate the effects of enrol- 
ling in the Boston prekindergarten program versus 
students’ other options. Ultimately, consistent with 
other lottery-based studies (explained in detail in 
the next section), we were able to leverage oversub- 
scribed first choice lotteries to address our central 
research question: What is the effect of enrolling in 
a Boston prekindergarten program versus not at all 
on children’s enrollment and persistence in BPS 
grades K-3; children’s risk of being retained in 
grade in K-2 or of being classified as special-needs
in K-3; and children’s third-grade state-standard- 
ized test scores in mathematics and reading? 


Method 
Data Set 


We use data from BPS and the Massachusetts 
Department of Elementary and Secondary Educa- 
tion. We begin with data on students’ choices and 
baseline demographics during the BPS assignment 
process from the spring of the 2006-2007 through 
2009-2010 school years (for enrollment in 2007-2008
through 2010-2011). We merge these data,
using each student’s unique identifier, with district 
and state administrative records covering the years 
students were age-eligible for prekindergarten (at 
age 4) through third grade. 


Sample 


Our sample comes from the population of stu- 
dents who applied to the Boston prekindergarten 
program for 4-year-olds. As shown in Appendix S1 
Figure 1, in all, 12,740 families applied to the pro- 
gram in our focal years. Nearly 10,000 of these fam- 
ilies applied to the district’s school choice lottery 
(described in greater detail in the next section) in 
the spring before their child was age-eligible for the 
program. This is what we call the “standard pro- 
cess”; it included four rounds and from these 
rounds, we identified naturally occurring lotteries 
for students’ first choice school involving 3,182 stu- 
dents, or 25% of all appliers and 32% of those who 
applied through the standard process. The distribu- 
tion of the lottery sample across the four rounds of 
the standard process is as follows: 99% of the sam- 
ple is drawn from round one, < 1% is drawn from 
round two, < 1% is drawn from round three, and 
no students are drawn from round four. Another 
2,769 (22% of appliers) applied via a later process 
after the four rounds had concluded. Their applica- 
tions were considered on a rolling, as-space-is-avail- 
able basis and they are not part of our lottery 
sample. 
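
The sample shares reported above can be checked with simple arithmetic. The sketch below is only a consistency check. It assumes that the number of standard-process applicants equals total applicants minus the 2,769 later-process applicants (9,971, in line with "nearly 10,000"); that count is an inference, not a figure stated in the text.

```python
# Counts reported in the text.
total_applicants = 12_740
lottery_sample = 3_182
later_process = 2_769

# Assumption (not stated in the text): everyone who did not apply through
# the later process applied through the standard process.
standard_process = total_applicants - later_process


def pct(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


share_of_all = pct(lottery_sample, total_applicants)        # lottery sample / all appliers
share_of_standard = pct(lottery_sample, standard_process)   # lottery sample / standard process
share_later = pct(later_process, total_applicants)          # later-process appliers / all appliers
```

Under that assumption, the three rounded shares match the 25%, 32%, and 22% reported in the text.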

Our lottery sample was diverse in its background
characteristics. As shown in Table 1, for
example, 35% of first choice lottery winners and 
40% of control group members were Hispanic; 
about a quarter overall were Black; another quarter 
were White; and the rest (~13%) were Asian or 
Other. About 58% of first choice lottery winners
were eligible for free/reduced lunch (57% for control
group children). A little over half spoke English at
home, a quarter spoke Spanish, and around 20%
spoke a non-English and non-Spanish home language.


Boston Prekindergarten School Assignment Process 
Details 


Under the BPS’s school choice plan, in the winter
and spring of each school year, families could apply
to up to 10 schools they wanted their child to
attend for prekindergarten the following fall (i.e.,
unlike most other systems, children were not
automatically assigned to their neighborhood
school). Families were assigned different priorities
to different schools based on criteria set by the district,
such as sibling and walk zone priority, sibling
priority only, walk zone priority only, and no priority
(listed in order from most to least priority).
Importantly, when there was more demand than
supply for a given school, the assignment algorithm
used family choice lists, school priorities, and a random
number to randomly assign some students
(and not others) to the school.

Table 1
Balance on Observables in the First Choice Lottery Sample

                                      Lottery    Control    Estimated
                                      winners    group      difference    p-Value
Race/ethnicity (%)
  Hispanic                            35.22      39.90      -4.68**       .003
  Black                               25.00      23.35       1.65         .271
  White                               26.73      24.34       2.39         .144
  Asian                               10.13       7.28       2.85**       .005
  Other                                2.92       4.14      -1.22         .146
Male (%)                              50.27      46.95       3.32         .126
Eligible for free/reduced lunch (%)   57.66      56.68       0.98         .604
Age                                    4.51       4.53      -1.97         .117
Country of origin USA (%)             94.89      94.53       0.36         .701
Home language (%)
  English                             52.18      55.06      -2.88         .133
  Spanish                             25.18      25.19      -0.01         .994
  Other                               22.64      19.75       2.89         .074
N children                            1,101      2,081

Note. There was a small amount of missing data on all baseline
characteristics except age: 12 children (0.4%) were missing race/ethnicity
and male information, 34 (1.1%) were missing male and
free/reduced lunch information, 113 (4.2%) were missing country
of origin information, and 5 (0.2%) were missing home language
information. Means in the table were computed using nonmissing
data. Values for first choice lottery winners are the simple means
for each requisite group. Values for the difference between lottery
winners and control group members are obtained from a regression
of a given baseline characteristic on a series of indicator variables
that identify each lottery plus an indicator variable that
equals 1 for lottery winners and 0 for lottery losers. The coefficient
on the lottery winner indicator equals the difference in the mean baseline
characteristic between lottery winners and control group members.
The value for control group members equals the
corresponding value for lottery winners minus the estimated
difference between lottery winners and control group members. A
two-tailed t-test was applied to the estimated differences. An F-test
was used to assess the statistical significance of the overall difference
between lottery winners and control group members reflected
by the full set of baseline characteristics in the table. The resulting
F value is not statistically significant (p = .2004). Statistical significance
levels are indicated as: ** p < .01.
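
The balance check described in the note to Table 1 regresses a baseline characteristic on lottery indicators plus a winner indicator. A minimal sketch of that coefficient follows, using the standard result that demeaning both variables within each lottery gives the same coefficient as including the lottery indicators. The function name, the tuple layout, and the toy data are illustrative assumptions, not the authors' code or data.

```python
from collections import defaultdict


def within_lottery_difference(rows):
    """Coefficient on `won` from regressing x on lottery indicators plus `won`.

    rows: iterable of (lottery_id, won, x), where won is 1 for lottery
    winners and 0 for lottery losers, and x is a baseline characteristic.
    Demeaning within each lottery is equivalent to including one indicator
    variable per lottery in the regression.
    """
    groups = defaultdict(list)
    for lottery_id, won, x in rows:
        groups[lottery_id].append((won, x))

    numerator = 0.0
    denominator = 0.0
    for members in groups.values():
        mean_won = sum(w for w, _ in members) / len(members)
        mean_x = sum(x for _, x in members) / len(members)
        for w, x in members:
            numerator += (w - mean_won) * (x - mean_x)
            denominator += (w - mean_won) ** 2
    if denominator == 0:
        raise ValueError("no lottery contains both winners and losers")
    return numerator / denominator


# Hypothetical toy data: two lotteries, x is a 0/1 baseline characteristic.
toy_rows = [
    ("A", 1, 1), ("A", 1, 1), ("A", 0, 0), ("A", 0, 0),
    ("B", 1, 0), ("B", 0, 0), ("B", 0, 0),
]
difference = within_lottery_difference(toy_rows)
```

Lotteries with more balanced numbers of winners and losers carry more weight in the pooled coefficient, which is why it need not equal the raw difference in group means.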

In the present analysis, we used data from stu- 
dents’ first application to prekindergarten to iden- 
tify naturally occurring lotteries among students 
with the same preference to the same oversub- 
scribed school/program (e.g., two students who 
listed school A’s regular education program who 
both had walk-zone priority to it) that listed the 
program as their first choice. As discussed in the 
following section, we constrained our sample to 
students’ first choice lotteries as only these students 
clearly participated in the equivalent of an experi- 
ment. Appendix S2 includes more details on the 
district’s school application and lottery process and 
how we identified the lotteries used in our analysis. 
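
The identification of naturally occurring lotteries described above can be sketched as a grouping step: applicants who listed the same school/program first and held the same priority form a cell, and a cell counts as a lottery only if some of its members received an offer and others did not. The record fields (`first_choice`, `priority`, `offered`) and the example data are hypothetical; the authors' actual procedure is documented in their Appendix S2.

```python
from collections import defaultdict


def find_lotteries(applicants):
    """Return first-choice cells in which random numbers decided offers.

    applicants: list of dicts with hypothetical keys
      'id', 'first_choice' (school/program), 'priority', 'offered' (bool).
    A cell is kept only if it contains both winners and losers.
    """
    cells = defaultdict(list)
    for applicant in applicants:
        key = (applicant["first_choice"], applicant["priority"])
        cells[key].append(applicant)

    lotteries = {}
    for key, members in cells.items():
        n_offered = sum(1 for m in members if m["offered"])
        if 0 < n_offered < len(members):  # some won, some lost
            lotteries[key] = members
    return lotteries


# Hypothetical example: only school A's walk-zone cell is oversubscribed.
example_applicants = [
    {"id": 1, "first_choice": "A", "priority": "walk", "offered": True},
    {"id": 2, "first_choice": "A", "priority": "walk", "offered": False},
    {"id": 3, "first_choice": "A", "priority": "sibling", "offered": True},
    {"id": 4, "first_choice": "B", "priority": "none", "offered": True},
    {"id": 5, "first_choice": "B", "priority": "none", "offered": True},
]
example_lotteries = find_lotteries(example_applicants)
```

Cells where everyone or no one received an offer are dropped because they contain no random variation in offers.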

Appendix S1 Table 2 displays the number of 
applicants to the prekindergarten program in the 
focal years, the number of lottery sample members, 
and the percentage of lottery participants each year. 
As mentioned earlier, across the study years, 25% 
of applicants were in an experimental lottery. The 
percentage of district schools represented
in the lotteries ranged from 67% to 83%
across years. However, some schools were highly
over-represented and others were under-repre- 
sented in the lottery sample—for example, about 
half of the students competed for just seven schools 
(10% of schools with prekindergarteners during this 
time period) and about 75% competed for just 18 
schools (26% of schools with prekindergarteners 
during this time period). 

Ultimately, although all lottery winners were 
offered the opportunity to enroll in the BPS prekinder- 
garten program, 91% did so, according to BPS admin- 
istrative records. Approximately 90% of lottery 
winners enrolled in their first choice school, 2% 
enrolled in a school not in their initial choice list, and 
9% did not enroll in Boston prekindergarten. While all 
the control group students lost the first lottery they 
competed in, an estimated 62% of them enrolled in 
the program either by coming off of a waitlist, win- 
ning a subsequent lottery, being assigned to an under- 
subscribed school farther down their choice list, or 
participating and being assigned in a subsequent
assignment round. Ultimately, roughly 13% of the
control group enrolled in their first choice school, 29%
enrolled in a school lower on their choice list, 23%
enrolled in a school not on their initial choice list, and
35% did not enroll in Boston prekindergarten.¹ Taken
together, this suggests an estimated BPS prekinder- 
garten enrollment rate difference of 29 percentage 
points (91% minus 62%), a difference that is low but 
not uncommon in research designs utilizing naturally 
occurring lotteries within choice processes (Abdulka- 
diroglu, Angrist, Narita, & Pathak, 2015; Angrist, 
Cohodes, Dynarski, Pathak & Walters, 2016). In the 
data analysis section, we describe how we use these 
lotteries to estimate the effect of enrolling for the target 
of our analysis—those that won a seat in their first 
choice BPS prekindergarten and subsequently
enrolled versus those lotteried out of the program 
who do not enroll (i.e., the compliers). 
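
The 29 percentage point enrollment gap reported above is the share of the sample whose enrollment was changed by the lottery outcome. One common way to turn an offer effect into an effect for compliers is to divide it by this gap (a Wald ratio). The sketch below illustrates only that general logic: this section does not state the form of the authors' analytic model, and the intent-to-treat value used here is a made-up number.

```python
def enrollment_gap(rate_winners, rate_losers):
    """Difference in prekindergarten enrollment rates, winners minus losers."""
    return rate_winners - rate_losers


def wald_estimate(itt_effect, gap):
    """Scale an offer (intent-to-treat) effect by the enrollment gap.

    Assumption: the lottery affects outcomes only through enrollment,
    so the offer effect is concentrated among compliers.
    """
    if gap == 0:
        raise ValueError("enrollment gap is zero; ratio is undefined")
    return itt_effect / gap


# Enrollment rates reported in the text: 91% of winners, 62% of losers.
gap = enrollment_gap(0.91, 0.62)

# Hypothetical offer effect on some outcome, for illustration only.
hypothetical_itt = 0.058
complier_effect = wald_estimate(hypothetical_itt, gap)
```

Because the gap is well below 1, any offer effect is scaled up by a factor of roughly 3.4 (1/0.29) when expressed per complier.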


Outcomes 
Enrollment and Persistence in BPS 


From district administrative records, we coded 
whether students enrolled in the BPS in kinder- 
garten through third grade. If a student enrolled in 
BPS at least 1 day in a given year, we set the enroll- 
ment variable for that year to 1 and to zero other- 
wise. From our yearly variables, we constructed a 
0/1 coded “ever enrolled” variable for kindergarten 
to third grade and 0/1 “persistence” variable for 
continuous enrollment for kindergarten to third 
grade. As previously mentioned, we included these 
outcomes because though unusual in the literature, 
they are increasingly relevant to public prekinder- 
garten programs and because attracting families to 
the BPS who might otherwise leave or choose other 
options was an original goal for the program. 
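
The coding rules for these outcomes can be sketched as follows. The input layout (days enrolled per grade) is a hypothetical stand-in for the administrative records, and "persistence" is read here as enrollment in every year from kindergarten through third grade, which is one interpretation of "continuous enrollment."

```python
GRADES = ("K", "1", "2", "3")


def code_enrollment(days_enrolled_by_grade):
    """Build yearly, ever-enrolled, and persistence indicators for one child.

    days_enrolled_by_grade: dict mapping grade label to days enrolled in BPS
    that year (missing grades are treated as zero days).
    """
    # Enrolled at least 1 day in a given year -> 1, otherwise 0.
    yearly = {
        grade: int(days_enrolled_by_grade.get(grade, 0) >= 1)
        for grade in GRADES
    }
    ever_enrolled = int(any(yearly.values()))
    persisted = int(all(yearly.values()))  # enrolled in all four years
    return yearly, ever_enrolled, persisted


# Hypothetical child who left the district after first grade.
yearly, ever_enrolled, persisted = code_enrollment({"K": 180, "1": 45})
```

A child with a single day of enrollment in any year counts as "ever enrolled," so the two summary variables can diverge sharply for families who leave the district early.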


Grade Retention and Special Needs Placement 


From administrative records, we constructed 
year-by-year measures of children’s K-2 grade 
retention and K-3 special needs placement, defined 
as having an Individualized Education Plan (IEP). 
We also constructed measures of whether the child 
was ever retained from K-2 or ever had an IEP in 
K-3. 


¹These are simple counts of control group enrollment rates,
presented for descriptive purposes. They approximate but do not
perfectly reproduce the 62% crossover rate estimated using our
main analytic model and presented in Table 3.
Notably, since 2008, the average districtwide
retention rate in Grades 1-3 in Boston has ranged 
from 2.9% to 7.5% at each grade level, meaning that 
the percentage of students ever retained by the end 
of third grade is around 10% (e.g., 2.9% and 7.5% 
averaged and multiplied by three; Massachusetts 
Department of Elementary & Secondary Education, 
2013). Nationally, the annual retention rate
was 1.5% in 2010 (Warren, Hoffman, & Andrew, 
2014). Regarding special education placement, Mas- 
sachusetts has the second highest rate of special 
education placement in the United States (Hehir, 
Grindal, & Eidelman, 2012). Approximately 19% of 
BPS elementary-school students in 2012 had been 
diagnosed with a disability (T. Grindal, personal 
communication, June 9, 2013). 


Third-Grade Standardized Test Scores 


For third-grade reading and mathematics analyses,
we used students’ scores on statewide mathematics and
reading standardized tests. Cohorts 1, 2, and 3 took
the Massachusetts Comprehensive Assessment Sys- 
tem (MCAS) in third grade, the test used for state 
accountability purposes in Massachusetts (see 
Appendix S3 for psychometric details). In 2015, the 
state of Massachusetts gave districts the choice 
between continuing to administer the MCAS or 
administering instead a new mathematics and Eng- 
lish Language Arts (ELA) exam based on the Com- 
mon Core standards, called the Partnership for 
Assessment of Readiness for College and Careers 
(PARCC) assessment (Massachusetts Department of 
Elementary & Secondary Education, 2015). In all, 
54% of districts in the state switched to the PARCC, 
whereas the rest continued to administer the 
MCAS. In the three largest school districts in the 
state—Boston, Worcester, and Springfield—individ- 
ual schools chose which test to administer. In Bos- 
ton, all but two schools with third-grade students 
chose to administer the PARCC. 

Amidst these changes, the state recommended 
that researchers standardize students’ estimated 
theta (i.e., IRT) scores when conducting analyses 
that require pooling across the MCAS and PARCC 
exams (Massachusetts Department of Elementary & 
Secondary Education, 2016). We followed this 
advice and standardized each student’s theta score 
on the mean and standard deviation of all third 
graders within the BPS taking the given exam in 
that year. Test score data in this article accordingly 
can be interpreted as a given group’s performance 
compared to the average BPS third grader. For both 
the MCAS and the PARCC, if students were
retained, we used their score from their first third-grade
test administration.
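
The standardization step can be illustrated as follows. This is a sketch under stated assumptions: the record layout and function name are hypothetical, and the use of the population standard deviation is our choice for illustration, since the text does not say which formula was used.

```python
from collections import defaultdict
from statistics import mean, pstdev

def standardize_within_group(records):
    """records: list of dicts with keys 'exam', 'year', and 'theta'.

    Returns z-scores in input order, each computed against the mean and
    standard deviation of all records sharing the same exam and year.
    """
    groups = defaultdict(list)
    for r in records:
        groups[(r["exam"], r["year"])].append(r["theta"])
    # Assumption: population SD; a group with no score variation would
    # need separate handling because its SD is zero.
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}
    z_scores = []
    for r in records:
        mu, sd = stats[(r["exam"], r["year"])]
        z_scores.append((r["theta"] - mu) / sd)
    return z_scores
```

In the study the reference group is all BPS third graders taking the given exam in that year, so `records` would need to hold the full district file rather than only the lottery sample.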


Covariates 


Using administrative records, we constructed a 
set of student-level covariates. We captured stu- 
dents’ race/ethnicity using a set of dichotomous 
variables that identified whether a student was 
Asian, Black, Hispanic, White, or mixed/other. Sim- 
ilarly, we used a set of dichotomous variables to 
identify whether the students’ home language was 
English only, Spanish, or another language. Using 
student birthdates, we calculated students’ age as 
of September 1 in the year they were applying to 
prekindergarten. We also created dichotomous variables
that identified whether the student was eligible
for free or reduced-price lunch; whether the
student was male; and whether the student’s country
of origin was the United States.
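
As a small illustration of the age covariate, the calculation below returns age in years as of September 1 of the application year. The function name and the 365.25-day year are assumptions made for this sketch; the text does not specify how fractional years were computed.

```python
from datetime import date

def age_on_september_1(birthdate, application_year):
    """Age in years as of September 1 of the given application year."""
    reference = date(application_year, 9, 1)
    # Assumption: convert elapsed days to years with a 365.25-day year.
    return (reference - birthdate).days / 365.25

age = age_on_september_1(date(2003, 3, 1), 2007)  # roughly 4.5 years
```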


School Context Variables 


To capture each student’s school experience in 
every follow-up year, we drew on publicly available 
data from the Massachusetts Department of Elemen- 
tary and Secondary Education (n.d.), which we 
merged on to each student’s data row by follow-up 
year and enrolled school ID. If the student was 
enrolled in multiple schools in a given year, we used 
the value for the school in which the student was 
enrolled the longest. We included indicators of the 
school-level student sociodemographic characteris- 
tics—percentage of students from low-income fami- 
lies (see Appendix S3 regarding a definition change 
in this measure in our last study year); the school’s 
percentage of English language learners; percentage 
of students with a non-English home language; per- 
centage of students with disabilities; percentage of 
students who were African American, Asian, His- 
panic, or White; and percentage of female students. 
For schools’ academic context, we included the per- 
centage of third-grade students who were proficient 
or higher on state ELA and mathematics standard- 
ized tests (for Cohorts 1-3, the MCAS and for Cohort 
4, either the MCAS or the PARCC, depending on 
which was used in the students’ schools in third 
grade). Finally, we also included measures of
schools’ percentage of licensed teachers, student/teacher
ratio, percentage of teachers rated as exemplary
or proficient in the state’s rating system, percentage
of teachers who were retained (i.e., remained working in the
same position as in the previous school year),
the percentage of students who remained in the school
throughout the school year (stability rate), and average
class size. Percentage of teachers rated as exemplary
or proficient in the state’s rating system and
average class size were available for Cohorts 3 and 4
only. We averaged the characteristics of students’
schools across the K-3 grades to create our key analytic
variables.
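
The two construction rules for the school context variables (take the school of longest enrollment within a year, then average across the K-3 years) can be sketched as below. Function names and data layout are hypothetical. Averaging over nonmissing years follows the note to Table 4; how ties in enrollment length were broken is not described, so the tie-breaking here is arbitrary.

```python
def longest_enrolled_school(spells):
    """spells: list of (school_id, days_enrolled) tuples for one student-year.

    Returns the school in which the student was enrolled the longest.
    Ties resolve to the first listed school (an arbitrary choice).
    """
    return max(spells, key=lambda spell: spell[1])[0]

def k3_context_average(yearly_values):
    """yearly_values: dict of grade -> school-level value, or None if missing.

    Averages over the nonmissing K-3 years; returns None if all are missing.
    """
    present = [v for v in yearly_values.values() if v is not None]
    return sum(present) / len(present) if present else None

school = longest_enrolled_school([("school_A", 40), ("school_B", 140)])
pct_low_income = k3_context_average({"K": 60.0, "1": None, "2": 66.0, "3": 72.0})
```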


Prekindergarten Year Care Settings 


For our first two cohorts, when students applied 
to the BPS, their parents answered a set of ques- 
tions about their child’s last child care experience. 
We used these data to identify the care setting of
children not enrolled in BPS prekindergarten (i.e.,
the counterfactual): Head Start, private preschool,
family day care, or parental/relative care. We also
used state administrative records that captured 
whether a student attended preschool in a tradi- 
tional public school or a charter school. We used 
district administrative records from the prekinder- 
garten year to identify which sample children 
attended BPS prekindergarten. 

The district changed its data collection form for 
this information for Cohort 3 and Cohort 4 such 
that setting type was not available to our study 
team. For this reason, we used control group care 
setting data for the first two cohorts only. More 
details on these data are available in Appendix S3. 


Data Analytic Plan 


To estimate the impacts of enrolling in the BPS 
prekindergarten program on study outcomes, we 
utilized naturally occurring lotteries in the Boston 
choice system. As is common when applying this 
experimental, lottery-based approach, our first step 
was to estimate the effect of being offered the opportu- 
nity to enroll in a Boston prekindergarten school 
(intent-to-treat [ITT]) using students’ first choice 
applications (Abdulkadiroglu et al., 2011; Bloom & 
Unterman, 2014; Dobbie & Fryer, 2011). We con- 
strained our sample to students’ first choice lotteries 
because when a student is competing in any lottery 
other than her first lottery, her probability of being 
assigned to a lower choice may depend in part on 
her earlier choices (and not just her random num- 
ber) and thus using these later lotteries could pose 
a threat to randomization (Bloom & Unterman, 
2014). While students may not compete in a lottery 
for their first choice and may compete in a lottery 
for a later choice, we focus only on their first choice 
lotteries to ensure that we have identified a purely 
experimental sample. 


Within our lottery-based research design, a set of 
students randomly “won” the opportunity to attend 
their first choice BPS prekindergarten program (the 
treatment group). Another set of students randomly 
“lost” the opportunity to attend their first choice BPS 
prekindergarten program (the control group). 
Because the lottery randomly assigns students, stu- 
dents in the treatment and control groups were, in 
expectation, equivalent in all measurable and unmea- 
surable characteristics. The basic approach for the 
analysis is to estimate, for each lottery, differences in 
mean outcomes for winners and control group mem- 
bers, and to average the results across lotteries. 

Specifically, we construct the following linear 
regression model: 


$$
Y_{ij} = \beta T_{ij} + \sum_{k=1}^{K} \pi_k I_{kij} + \sum_{p=1}^{P} \theta_p X_{pij} + \varepsilon_{ij}, \tag{1}
$$

where $Y_{ij}$ is a relevant short- or medium-term outcome
for student $i$ in lottery $j$; $T_{ij}$ is a lottery winner
indicator equal to 1 if student $i$ wins lottery $j$ and 0
otherwise; $I_{kij}$ is a set of $K$ lottery indicators equal to 1
for lottery $j$ and 0 otherwise; $X_{pij}$ is a set of $P$ student-level
covariates (race/ethnicity, gender, eligibility for
free or reduced-price lunch, age, country of origin,
and home language status); and $\varepsilon_{ij}$ is a random error
for student $i$ that is clustered by the prekindergarten
school that students entered after their lottery. This
latter information is available only for students who
enroll in the Boston prekindergarten program. Thus
for this purpose, we assume that students who do
not enroll in the program, the majority of whom are
in the control group, are not clustered together in
another setting. The $\beta$ coefficient identifies the effect
of winning a lottery on student outcomes and its
associated t-statistic identifies statistical significance.
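
To make the logic of the lottery indicators concrete, the sketch below computes the winner coefficient for the special case with no covariates. In that case, ordinary least squares with a full set of lottery indicators reduces to a weighted average of within-lottery differences in means, with each lottery weighted by its size times the variance of the win indicator within it. The function name and input layout are hypothetical, and the sketch omits the covariates and clustered standard errors used in the actual model.

```python
from collections import defaultdict

def itt_within_lottery(rows):
    """rows: iterable of (lottery_id, won, outcome), with won in {0, 1}.

    Returns the coefficient on the win indicator from a regression of the
    outcome on the win indicator plus lottery fixed effects (no covariates).
    """
    by_lottery = defaultdict(lambda: {0: [], 1: []})
    for lottery_id, won, outcome in rows:
        by_lottery[lottery_id][won].append(outcome)

    weighted_sum = 0.0
    total_weight = 0.0
    for groups in by_lottery.values():
        losers, winners = groups[0], groups[1]
        if not losers or not winners:
            continue  # no winner-loser contrast, so the lottery adds nothing
        n = len(losers) + len(winners)
        share_won = len(winners) / n
        diff = sum(winners) / len(winners) - sum(losers) / len(losers)
        # Fixed-effects weight: lottery size times variance of the win indicator.
        weight = n * share_won * (1 - share_won)
        weighted_sum += weight * diff
        total_weight += weight
    return weighted_sum / total_weight
```

One consequence of this weighting is that lotteries in which nearly everyone wins or nearly everyone loses contribute little to the pooled estimate.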
Our ITT estimates represent the effect of winning 
one’s first lottery and thus do not answer the ques- 
tion likely of most substantive interest to practition- 
ers and policymakers—the effect of enrolling in 
Boston prekindergarten versus not doing so. Thus 
we use students’ first choice lottery result as an
instrument for estimating the effects of BPS prekindergarten
enrollment, often referred to in the literature
as a complier average causal effect (CACE; Gennetian,
Morris, Bos, & Bloom, 2005). In this context,
the effect of enrollment represents the effect of enrol- 
ling in Boston prekindergarten for the subgroup of 
students—the compliers—who won their first choice 
lottery and enrolled in BPS prekindergarten com- 
pared with those that lost their first choice lottery 
and ultimately did not enroll in BPS prekindergarten.
As previously mentioned, the overwhelming majority
of lottery winners enrolled in their first choice
school, making our enrollment effect more specifi- 
cally represent the effect of enrolling in children’s first 
choice program, versus not at all. 

Because this approach is new to estimating the 
effects of prekindergarten (though it has been used 
in contexts with older children; Abdulkadiroglu 
et al., 2011; Bloom & Unterman, 2014; Dobbie & 
Fryer, 2011), it merits some additional explanation. 
In particular, what is a complier in this context? A
complier is a student who randomly won or lost his or
her first choice lottery and, for winners, enrolled in a
Boston prekindergarten program (first choice or
otherwise), and for lottery losers, did not enroll in
Boston prekindergarten. Notably, some children
who lost their first lottery won a slot to a school
lower on their choice list and attended Boston
prekindergarten in that school. Our estimates of the
effect of enrollment ultimately do not apply to them
(known as “always takers” in the literature), just
as they do not apply to children who would not have
enrolled in the program regardless of whether they
won or lost their first lottery (i.e., “never takers”).
Our instrument effectively carves out the exogenous
variation in enrollment that is due to children’s first
choice lottery result and uses it to estimate the causal
effect of enrollment for the subgroup of compliers.
Notably, as previously mentioned, effects for compliers in
some contexts have been shown not to generalize to
the full population (Chyn, 2018). This aspect of our
design is why we emphasize that our analysis is a
window into the medium-term effects of the program,
rather than an evaluation necessarily for all students.
This is also why we conduct multiple analyses
of the generalizability of our results.

To calculate CACE, we conducted a two-stage least 
squares analysis. The lotteries we drew on range in 
size with many of their samples being quite small. To 
avoid finite sample bias from “weak instruments” 
(Bound, Jaeger, & Baker, 1995), we estimated the 
CACE using a single-instrument model (also known 
as a Wald estimate). This approach has been used for 
past analyses of randomized experiments and lottery- 
based studies (Abdulkadiroglu et al., 2011; Bloom & 
Unterman, 2014; Gennetian et al., 2005; Ludwig & 
Kling, 2007). The first stage was specified as: 


$$
E_{ij} = \beta T_{ij} + \sum_{k=1}^{K} \pi_k I_{kij} + \sum_{p=1}^{P} \theta_p X_{pij} + w_{ij}, \tag{2}
$$


where $E_{ij}$ is a BPS prekindergarten enrollment indicator
equal to 1 if student $i$ ever enrolled in BPS
prekindergarten and 0 otherwise, $w_{ij}$ is a random error, and all other
terms are defined as in Equation 1. Our first-stage
F-statistic equals 11, which is just above the recommended
threshold for instrument strength (Bloom,
Zhu, & Unlu, 2010; Bound et al., 1995).

The second stage equation was specified as: 


$$
Y_{ij} = \delta \hat{E}_{ij} + \sum_{k=1}^{K} \pi_k I_{kij} + \sum_{p=1}^{P} \theta_p X_{pij} + e_{ij}, \tag{3}
$$


where $\hat{E}_{ij}$ equals the fitted value of the enrollment
outcome from the first-stage equation, $e_{ij}$ is a random
error that is clustered by the prekindergarten
school that students entered after their lottery, and
all other terms are defined as in Equation 1. The
estimated value of $\delta$ is a consistent estimate of the
average effect of enrolling in BPS prekindergarten
for target BPS prekindergarten enrollees. We fit our
CACE models in SAS, using MDRC code described 
in detail in Bloom and Unterman (2014). Impor- 
tantly, while the ITT approach meets the What 
Works Clearinghouse’s highest standard of evi- 
dence, the CACE approach is considered quasi-ex- 
perimental (What Works Clearinghouse, 2014). 
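
With a single binary instrument, the two-stage estimate equals the ratio of the ITT effect on the outcome to the ITT effect on enrollment (the Wald estimate). The sketch below shows that ratio for the simplest case, without lottery indicators or covariates. It is an illustration with hypothetical names, not the MDRC SAS code used in the study.

```python
def _mean(values):
    return sum(values) / len(values)

def wald_estimate(won, enrolled, outcome):
    """Single-instrument (Wald) estimate from three parallel lists.

    won: 1 if the student won the lottery, else 0 (the instrument)
    enrolled: 1 if the student enrolled in the program, else 0
    outcome: the outcome of interest
    Returns (itt, first_stage, cace).
    """
    winners = [i for i, w in enumerate(won) if w == 1]
    losers = [i for i, w in enumerate(won) if w == 0]
    # Reduced form: effect of winning on the outcome.
    itt = _mean([outcome[i] for i in winners]) - _mean([outcome[i] for i in losers])
    # First stage: effect of winning on enrollment (the compliance rate).
    first_stage = _mean([enrolled[i] for i in winners]) - _mean([enrolled[i] for i in losers])
    return itt, first_stage, itt / first_stage

# Toy data: all winners enroll, and half of the lottery losers also enroll.
itt, first_stage, cace = wald_estimate(
    won=[1, 1, 1, 1, 0, 0, 0, 0],
    enrolled=[1, 1, 1, 1, 1, 1, 0, 0],
    outcome=[1, 1, 1, 0, 1, 0, 0, 0],
)
```

The same ratio can be checked against the kindergarten enrollment row of Table 3: an ITT of 7.02 percentage points divided by a compliance rate of about .29 gives roughly 24, close to the reported CACE of 23.81 (the compliance rate shown in the table is rounded).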

Finally, there was a small amount of missing data
on all covariates except age in our lottery sample,
ranging from 0.4% to 4.2% (and likewise in our full
sample, ranging from 0.4% to 4.0%; see the Table 1
note). As our primary approach in the lottery analysis,
we imputed missing covariates using multiple
imputation with 40 data sets. Our lottery estimates
are not sensitive to problems of missing covariate
data (see the “Robustness Checks” section below).
We describe missing data on outcomes
in the next section.
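
The text does not describe how estimates from the 40 imputed data sets were combined. A common convention is Rubin's rules, sketched below as an assumption rather than as the study's documented procedure; the function name is hypothetical.

```python
from statistics import mean, variance

def pool_rubin(estimates, standard_errors):
    """Combine per-imputation estimates and standard errors with Rubin's rules.

    Requires at least two imputations. Returns (pooled_estimate, pooled_se).
    """
    m = len(estimates)
    pooled = mean(estimates)
    # Within-imputation variance: average of the squared standard errors.
    within = mean([se ** 2 for se in standard_errors])
    # Between-imputation variance: sample variance of the point estimates.
    between = variance(estimates)
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5
```

With 40 imputations the `(1 + 1/m)` inflation factor is 1.025, so the pooled standard error is driven mostly by the within-imputation variance unless estimates differ widely across data sets.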


Results 
Balance on Observables and Attrition Analysis 


We compared the background characteristics of 
first choice lottery winners and control group mem- 
bers in the lottery sample (see Table 1 for this ITT 
analysis). There are 2 (of 12) statistically significant 
differences between the two groups—lottery win- 
ners are 4.7 percentage points less likely to be His- 
panic (p = .003) and 2.9 percentage points more 
likely to be Asian (p = .005). A joint F-test used to 
assess the statistical significance of the overall dif- 
ference between the first choice lottery winners and 
control group members could not reject the null 
hypothesis that there was no difference between the 
two groups (p = .200). We controlled for these
background characteristics (as is suggested by What
Works Clearinghouse, 2014), both to improve preci- 
sion and, for the characteristics for which there was 
evidence of imbalance, to reduce the threat of possi- 
ble bias in our estimates. See Appendix S1 Table 3 
for the estimated complier averages for these same 
background characteristics. On average, compliers 
were quite similar to the full ITT lottery sample. 

In an analysis of the availability of our outcome 
data for our first choice lottery sample, we found 
that outcome data were missing at relatively low 
rates (3%-16%) and that differences in outcome 
missingness by treatment status were relatively 
small (1-5 percentage points more likely to be miss- 
ing for the control group, across outcomes). These 
levels of missingness meet the What Works Clear- 
inghouse’s (2014) standards for rigor. Six out of 11 
differences across outcomes were statistically signif- 
icant at conventional levels. The resulting F-value 
from a joint F-test of differences in the background 
characteristics of children with nonmissing outcome 
data by treatment status was not statistically signifi- 
cant. See Appendix S4 for full attrition details and 
results. 


Care Settings in the Prekindergarten Year 


Table 2 displays results from fitting our ITT and 
CACE models with care setting information in the 
prekindergarten year for Cohorts 1 and 2 as the 
outcomes, for children who had nonmissing coun- 
terfactual data. The results shown are important for 
identifying what Boston prekindergarten is being 
compared to in our study. From our ITT results, 
97% of lottery treatment group members enrolled 
in BPS prekindergarten and nearly all the treatment 
group members (99.6%) enrolled in some kind of 
center-based preschool. In the lottery control group, 
97% enrolled in some kind of center-based pre- 
school—substantially exceeding the national aver- 
age of 69% (Whitehurst & Klein, 2015). Overall, 
72% of control group members enrolled in BPS 
prekindergarten, 14% in private centers, 4% in 
Head Start, 4% in charters, 3% in other public pro- 
grams, 1% in family day cares, and 2% were at 
home with a family member. 

By definition, for our BPS enrollment effect estimates,
all our treatment group compliers attended
BPS and none of the control group compliers did so.
Among control group compliers, 88% were in other
center-based preschool programs. All told, 48% of
control group compliers attended private programs,
17% Head Start, 12% charters, 12% other public
programs, 6% family day cares, and 6% were at
home.


Table 2
Children’s Care Settings (Cohorts 1 and 2) in the Prekindergarten Year

ITT
                                    Lottery    Control    Estimated
                                    winners    group      difference    p-Value
Any center-based preschool            99.55      96.72       2.83***     < .001
Preschool types
  BPS                                 96.64      72.26      24.39***     < .001
  Non-BPS center-based preschool       2.91      24.47     -21.56***     < .001
    Private                            2.01      13.62     -11.60***     < .001
    Head Start                         0.00       4.18      -4.18***       .008
    Public                             0.00       2.81      -2.81***     < .001
    Charter                            0.89       3.86      -2.96          .067
Other settings
  Family day care                      0.00       1.39      -1.39          .372
  At home                              0.45       1.88      -1.44          .512
Total                                100.00     100.00

CACE
                                    Lottery winner   Control group   Estimated
                                    compliers        compliers       difference    p-Value
Any center-based preschool              100.00           88.40          11.60       < .001
Preschool types
  BPS                                   100.00            0.00         100.00***    < .001
  Non-BPS center-based preschool          0.00           88.40         -88.40***    < .001
    Private                               0.00           47.57         -47.57***    < .001
    Head Start                            0.00           17.14         -17.14***      .008
    Public                                0.00           11.54         -11.54***    < .001
    Charter                               0.00           12.15         -12.15         .067
Other settings
  Family day care                         0.00            5.71          -5.71         .372
  At home                                 0.00            5.88          -5.88         .512
Total                                   100.00          100.00

Note. Care setting types were reported by parents at the time of application to Boston kindergarten (e.g., the winter, spring, or summer
preceding kindergarten fall), were pulled from Boston prekindergarten enrollment records, or were pulled from age 4 state administrative
records on traditional public school or charter school enrollment. Values were obtained from fitting our primary ITT and CACE
equations with each care setting as the requisite outcome. Data were missing for 11.5% of students. Within each column, the BPS, non-BPS
center-based preschool, family day care, and at home rows sum to 100. Statistical significance levels are indicated as:
*p < .05; **p < .01; ***p < .001. ITT = intent-to-treat; CACE = complier average causal effect; BPS = Boston Public Schools.

In recent preschool evaluations, about a third to 
half of the control group has attended other center-based
preschool programs (e.g., 34% in Tennessee,
48% in Tulsa, about 50% in Head Start; Bloom & 
Weiland, 2015; Hill et al., 2015; Lipsey et al., 2018). 
These lottery-sample counterfactual findings accord- 
ingly are quite distinctive within the current evi- 
dence base. 


Impacts 


Examining lottery sample members’ K-3 enroll- 
ment in the BPS, we found that first choice lottery 
winners enrolled in the BPS at higher rates at each 
grade compared to the control group. As shown in 
Table 3 Column 2, effects of winning a first choice 
lottery (ITT) ranged from about 7 to 10 percentage 
points (p < .0001). The effects for compliers who 
enrolled in BPS (CACE, see Column 6) at each 
grade level K-3 were large, ranging from 24 to 34 
percentage points (p < .0001). There was also a 
large difference of 34 percentage points in consis- 
tent K-3 enrollment in the district between lottery 
winner compliers and control group compliers 
(p < .0001). Treatment and control group complier
means shown in Columns 8 and 9 further illuminate
these findings; 74% of lottery winner compliers
enrolled in BPS continuously from K-3 versus just
39% of control group compliers. These findings
demonstrate that pre-K enrollment markedly 
increased later enrollment in district public schools. 

For other examined outcomes—children’s grade 
retention, special education placement, and stan- 
dardized test scores—findings in Table 3 demon- 
strate that there were no effects of winning a first 
choice lottery (ITT; Column 2) nor of enrolling in 
Boston prekindergarten (CACE; Column 6). The 
effects of winning a first choice lottery (ITT) for 
these outcomes were uniformly small, close to zero 
in magnitude, and not statistically significant.
Compliance rates across these outcomes were around 29
percentage points (see Column 4; p < .0001). Because
the CACE is the ITT estimate divided by this compliance
rate difference, CACE estimates are larger in magnitude
than the ITT estimates (about 3.4 times as large at a
compliance rate of ~29 percentage points). Also
notable, CACE confidence intervals (Column 10)
were relatively wide, ranging from substantially
negative to substantially positive. For example, for
“ever placed in special education,” the point estimate
was 0.8 percentage points with a 95% confidence
interval of -10 to 12 percentage points.
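
The width of these intervals follows directly from the CACE standard errors. As a rough check, the sketch below applies a normal-approximation 95% interval to the “ever SPED” CACE estimate and standard error reported in Table 3 (0.84 and 5.77). The 1.96 multiplier is an assumption made for illustration, so the result may differ slightly from the published interval because of rounding or the critical value actually used.

```python
def normal_ci(estimate, standard_error, z=1.96):
    """Normal-approximation confidence interval: estimate +/- z * SE."""
    margin = z * standard_error
    return estimate - margin, estimate + margin

# "Ever SPED" CACE estimate and standard error from Table 3.
low, high = normal_ci(0.84, 5.77)
# low is about -10.47 and high is about 12.15 (percentage points)
```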

For these outcomes too, the treatment and control
complier means are illuminating (see Columns
8-9). In kindergarten, for example, very few students
were retained in grade: 1.6% of treatment
compliers and 1.4% of control group compliers. These
levels are substantially below the aforementioned district
average of 2.9% to 7.5% in Grades 1-3 at each
grade level in BPS (Massachusetts Department of Elementary
& Secondary Education, 2013). For special
education, mean levels for lottery compliers more closely
approximate the district average of 19% of BPS
elementary-school students diagnosed with a disability.
Specifically, about 16% of treatment and control
compliers were classified as special education students
in third grade. For state-standardized tests, both
groups scored substantially higher than the average
BPS third grader. In ELA, lottery winner compliers scored
0.40 SD higher than the average BPS third grader,
whereas control group compliers scored 0.38 SD
higher. In math, the means were 0.35 SD for lottery
winner compliers and 0.53 SD for control compliers,
a more sizable difference compared to other outcomes
but not statistically significant.


Table 3
First Choice Lottery Sample ITT Impacts, Compliance Rates Difference, and CACE

                                                          First stage     First
                                       ITT      ITT (SE)  (compliance)    stage SE
Enrolled in BPS (%)
  Prekindergarten                     29.47***    1.06
  Kindergarten                         7.02***    1.73      .29***         .01
  First grade                         10.14***    1.90      .29***         .01
  Second grade                         8.49***    2.00      .29***         .01
  Third grade                          7.54***    2.04      .29***         .01
  Enrolled K-3                        10.05***    2.07      .29***         .01
  Ever enrolled                        8.24***    1.48      .29***         .01
Retained in grade (%)
  Retained in kindergarten             0.04       0.59      .29***         .01
  Retained in first grade              1.24       0.80      .29***         .01
  Retained in second grade            -0.03       0.62      .29***         .01
  Ever retained                        1.22       1.10      .30***         .01
Special education classification (%)
  SPED in kindergarten                -0.36       1.26      .29***         .01
  SPED in first grade                  0.21       1.41      .29***         .01
  SPED in second grade                 1.5        1.53      .29***         .01
  SPED in third grade                  0.01       1.66      .29***         .01
  Ever SPED                            0.25       1.67      .30***         .01
MCAS and PARCC
  English language arts                0.01       0.04      .29***         .01
  Math                                -0.05       0.04      .29***         .01

                                                            Lottery winner   Control group
                                                            compliers        compliers       Confidence
                                       CACE      CACE (SE)  mean             mean            interval
Enrolled in BPS (%)
  Prekindergarten                                            100                0
  Kindergarten                        23.81***    5.97        91.06            67.25          12.51, 35.89
  First grade                         34.41***    6.53        85.60            51.19          22.16, 48.77
  Second grade                        28.80***    6.88        79.05            50.24          15.78, 42.76
  Third grade                         25.57***    7.02        75.77            50.2           12.22, 39.75
  Enrolled K-3                        34.10***    7.13        73.58            39.49          20.68, 48.62
  Ever enrolled                       27.97***    5.09       100               72.03          18.45, 38.39
Retained in grade (%)
  Retained in kindergarten             0.14       2.05         1.55             1.41          -3.87, 4.16
  Retained in first grade              4.28       2.77         4.19            -0.09          -1.14, 9.71
  Retained in second grade            -0.09       2.15         1.93             2.01          -4.30, 4.12
  Ever retained                        4.20       3.79         7.50             3.31          -3.23, 11.63
Special education classification (%)
  SPED in kindergarten                -1.23       4.36         7.47             8.71          -9.78, 7.30
  SPED in first grade                  0.73       4.87        10.55             9.81          -8.82, 10.28
  SPED in second grade                 5.18       5.27        13.56             8.39          -5.15, 15.51
  SPED in third grade                  0.02       5.74        15.94            15.91         -11.22, 11.27
  Ever SPED                            0.84       5.77        17.27            16.42         -10.46, 12.15
MCAS and PARCC
  English language arts                0.02       0.13         0.40             0.38          -0.24, 0.28
  Math                                -0.18       0.14         0.35             0.53          -0.45, 0.10

Note. There were no missing data on enrollment variables. Other outcomes were missing data as follows: grade retention, 3%-11% across
variables; special education, 4%-11% across variables; and test scores, 13%-16%. Note that we also calculated ITT and CACE effect sizes
for MCAS and PARCC (the continuous outcomes) by dividing the estimated effect by the standard deviation of the control group and
found they were nearly identical in magnitude to the ITT and CACE estimates shown in the table. Statistical significance levels are
indicated as: *p < .05; **p < .01; ***p < .001. ITT = intent-to-treat; CACE = complier average causal effect; BPS = Boston Public Schools;
MCAS = Massachusetts Comprehensive Assessment System; PARCC = Partnership for Assessment of Readiness for College and
Careers.


Differences in Students’ K-3 School Experiences 


To provide context for these results, we examined 
whether differences in students’ K-3 contexts might
have driven our findings, concentrating on the differences
in contexts for compliers. Specifically, as
described in the measures section earlier, we used 
publicly available school characteristics data and stu- 
dent enrollment records and calculated the average 
characteristics of the school students were enrolled in 
from grades K-3. We then analyzed these student- 
specific measures as outcomes using our standard 
CACE model. As shown in Table 4, there were some 
statistically significant, though relatively small, dif- 
ferences in 9 of 18 characteristics of treatment com- 
plier and control complier K-3 environments. For 
example, treatment compliers had fewer peers who 
were low-income students (66% vs. 72%, p < .05) 
and African American (26% vs. 37%, p < .001) than 
did control compliers. Treatment compliers also had
more peers who were Hispanic (40% vs. 33%,
p < .01) and White (22% vs. 16%, p < .05). They also
attended schools with slightly higher percentages of
licensed teachers (97% vs. 94%, p < .01) and with more stable
student bodies (89% stable vs. 84%, p < .001). The
percentage of children proficient on third-grade tests
in children’s K-3 schools favored the treatment group
compliers (e.g., 44% vs. 41% for ELA), but the difference
was not statistically significant.


Table 4
CACE Estimates of K-3 School Context Differences Between Lottery Winner Compliers and Control Group Compliers

                                     Lottery winner   Control group   Estimated
                                     compliers        compliers       difference    p-Value
Student background characteristics
  % Low-income                           65.62            72.21          -6.59*        .019
  % ELL                                  28.35            26.41           1.94         .305
  % non-English home language            39.58            35.77           3.81         .065
  % Disabilities                         17.45            17.81          -0.36         .631
  % African American                     26.34            36.98         -10.64***    < .001
  % Asian                                 8.07            10.39          -2.33*        .034
  % Hispanic                             40.43            32.54           7.89**       .001
  % White                                22.17            16.04           6.12*        .030
  % Female                               48.23            48.47          -0.24         .511
Student performance: % proficient in third grade
  ELA                                    44.28            40.61           3.67         .117
  Math                                   51.29            47.69           3.60         .144
Teacher and school characteristics
  % Licensed Ts                          97.29            93.79           3.50**       .003
  Student-T ratio                        13.76            13.23           0.53*        .011
  % Exemplary Ts                         14.63            11.41           3.22         .109
  % Proficient Ts                        78.85            79.74          -0.89         .680
  % T retention                          81.36            78.42           2.94**       .009
  Stability                              89.12            83.90           5.21***    < .001
  Avg class size                         19.06            18.44           0.61         .258

Note. Using publicly available data from the Massachusetts Department of Elementary and Secondary Education, we averaged available
school-level data across the schools in which a student was enrolled for the longest period of time each year in kindergarten, first, second,
and third grade. If data were missing for a student in a given year (e.g., first grade), we used nonmissing data to compute the student’s K-3
context averages (e.g., K, second, third). Across variables, data were missing for 8%-11% of students overall and 5% of treatment students
were missing data compared with 13% of their control group counterparts. Percentage of teachers scoring proficient or exemplary on
state ratings and average class size were available for Cohorts 3 and 4 only. Statistical significance levels are indicated as: *p < .05; **p < .01;
***p < .001. CACE = complier average causal effect; ELA = English Language Arts; ELL = English Language Learners; Ts = teachers.

On the whole, while there were lottery-induced 
differences in students’ K-3 school experiences 
favoring the treatment group, these were relatively 
small. Both groups of students attended elementary 
schools in which their peers were majority low- 
income and in which the majority of teachers were 
rated as exemplary or proficient by the state’s 
teacher evaluation system. 


Robustness Checks 


As a robustness check, we fit third-grade outcome
models in the first choice lottery sample
without multiple imputation of covariates or outcomes
and with multiple imputation for both
covariates and outcomes (vs. our primary strategy 
of imputing covariates but not outcomes; see 
Appendix S5 Table 1). We also fit school context 
models that used characteristics of students’ K and 
third-grade schools only, in case averaging over 
different numbers of years for students with miss- 
ing data in one or more of their K-3 years was dis- 
torting or misrepresenting the schooling context 
differentials (Appendix S5 Table 2). We re-fit third- 
grade standardized test models dropping children 
who were ever retained, in case taking the test at 
an older age or in a different year from the rest of 
the cohort somehow biased our estimates even 
though there were no impacts on retention (for 
parsimony, these results are available upon 
request). Also, most of our outcomes were dichotomous;
we used linear probability models as our
primary modeling strategy because our sample size
is well over the threshold for doing so and because
these models are more straightforward to interpret (Angrist &
Pischke, 2008). We did, however, refit key models
with dichotomous outcomes using logistic regression
as a sensitivity check (results available upon
request). We also refit impact models with
prekindergarten enrollment defined as being
enrolled for at least 150 days of the school year
(rather than 1 day; see Appendix S5 Table 3). We
chose 150 days because of the distribution of the 
enrollment variable in our sample; there is no 
agreed-upon threshold in the literature and recent 
preschool studies have used different thresholds 
(Lipsey, Farran, & Hofer, 2015; Phillips, Gormley, 
& Anderson, 2016). Across these checks, we found 
no evidence that our main results were sensitive to 
our data analytic decisions. 

Finally, one assumption underlying our CACE 
analysis—that always-takers in both the treatment 
and control groups (i.e., children who would have 
enrolled in Boston prekindergarten regardless of 
their first choice treatment assignment status) expe- 
rienced the same effect of enrollment—is difficult to 
evaluate. Treatment group always-takers in our 
study largely enrolled in their first choice school; 
among control group crossovers, approximately a 
third did so. If the level at which a student ranked 
a prekindergarten program is indicative of their 
match with the program or its quality, it is possible 
that the two-thirds of the control group crossovers 
that enrolled in lower choices experienced a lower 
quality program. Empirically, when we compared 
the first choice schools and BPS schools actually 
attended in prekindergarten for the two-thirds of 
control group crossovers who did not enroll in 
their first choice but enrolled in a lower choice, we 
found that their first choice and their school 
attended differed on 9 of 13 school context characteristics,
with the first choice school generally appearing
somewhat higher in quality than the school in
which they actually enrolled. However, we also 
found that school-level context variables were only 
weakly correlated with observed prekindergarten 
process quality (Weiland & Unterman, 2019). Fur- 
thermore, among all control crossovers, about two- 
thirds were unassigned to the program after their 
first round; they were not assigned to a lower 
choice as part of the first round of the lottery sys- 
tem. These students crossed over later, which might 
indicate that their parents were particularly highly 
motivated and that therefore, they might have ben- 
efitted more from their Boston prekindergarten 
classroom. Ultimately, the direction of any potential 
bias from violation of the always-taker CACE 
assumption is ambiguous. 
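
For readers less familiar with the quantities involved, the sketch below shows one standard way to move from an ITT estimate to a CACE estimate: the Wald (instrumental variables) ratio of the ITT effect on the outcome to the lottery-induced difference in enrollment rates. It is an illustration under simplifying assumptions (no covariates or lottery-block fixed effects, invented data, hypothetical function names), not the estimation procedure used in this study. Control group crossovers enter through the enrollment rate among lottery losers.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def itt_and_cace(won, enrolled, outcome):
    """Return (ITT, first stage, CACE) from parallel lists.

    won:      1 if the child won the first choice lottery, else 0
    enrolled: 1 if the child enrolled in the program, else 0
    outcome:  numeric outcome (for example, 1 if ever retained)
    """
    y_win = [y for w, y in zip(won, outcome) if w == 1]
    y_lose = [y for w, y in zip(won, outcome) if w == 0]
    d_win = [d for w, d in zip(won, enrolled) if w == 1]
    d_lose = [d for w, d in zip(won, enrolled) if w == 0]

    itt = mean(y_win) - mean(y_lose)          # effect of winning the lottery
    first_stage = mean(d_win) - mean(d_lose)  # lottery-induced enrollment gap
    if first_stage == 0:
        raise ValueError("The lottery did not change enrollment; CACE is undefined.")
    return itt, first_stage, itt / first_stage


# Hypothetical data: one winner does not enroll, one loser crosses over.
won = [1, 1, 1, 1, 0, 0, 0, 0]
enrolled = [1, 1, 1, 0, 1, 0, 0, 0]
outcome = [1, 1, 0, 0, 1, 0, 0, 0]
itt, first_stage, cace = itt_and_cace(won, enrolled, outcome)
```

The ratio rescales the ITT by the share of children whose enrollment was changed by the lottery, so it is valid only under the usual instrumental variables assumptions.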


Gauging External Validity 


Following other lottery-based studies (i.e., 
Abdulkadiroglu et al., 2011), we explored the exter- 
nal validity of our results using descriptive and 
quasi-experimental analyses. This work was impor- 
tant in our context, given that (as previously men- 
tioned) students in the lottery sample comprised 
25% of all appliers in focal years. Also, some 
schools were highly over-represented and others 
were under-represented in the lottery sample—for 
example, about half of the students competed for 
just seven schools (10% of schools with prekinder- 
garteners during this time period) and about 75% 
competed for just 18 schools (26% of schools with 
prekindergarteners during this time period). 

To explore external validity, we first compared 
the background characteristics of first choice lottery 
sample members to those of children in the full 
applicant sample. As shown in Appendix S6 
Table 1, while the two samples appeared to be 
similar in age, country of origin, and gender, the 
lottery sample was more economically advantaged 
and more likely to be White than all BPS 
prekindergarten applicants. About 51% of the lot- 
tery sample qualified for free-/reduced-price lunch, 
whereas 65% of all BPS appliers did. Regarding 
students’ race/ethnicity, White students comprised 
28% of the lottery sample versus 17% of all BPS 
prekindergarten appliers; Hispanic students com- 
prised 39% of the lottery sample versus 44% of all 
BPS prekindergarten appliers. About 21% of the 
lottery sample was Black versus 28% of the full 
applicant sample. Fifty-seven percent of the lottery 
sample spoke English at home versus 50% of the 
full sample. 

Next, we compared the comparison group care set- 
tings of our lottery sample to the full applicant sam- 
ple. Among children whose families applied to the 
BPS prekindergarten but did not enroll (i.e., full
sample nonenrollees), 76% attended a non-BPS cen- 
ter-based preschool (vs. 97% of lottery control 
group members and 88% of lottery control group 
compliers), and the types of care settings were markedly different from
those in the lottery control group (see Table 2 and 
Appendix S6 Table 6). For example, 37% of full 
sample nonenrollees attended private centers, 26% 
attended Head Start, and 13% attended other public 
programs. For lottery control compliers, 48% 
attended private centers, 17% attended Head Start, 
and 24% attended other public programs. In all, 
18% of full sample nonenrollees were at home ver- 
sus 6% of lottery control group compliers. 


Using K-3 school context data, we also examined 
the representativeness of schools in our lottery-based 
analysis. We defined over-represented schools as 
schools for which 50% (N = 6 schools) or 75%
(N = 17 schools) of first choice lottery sample members
competed. As shown in Appendix S1 Table 4,
over-represented schools (75% threshold) in our lot- 
tery-based study had a considerably lower average 
percentage of students from low-income families 
compared to other district schools (64% vs. 77%, 
respectively; p < .001), proportionately more White 
students (24% vs. 13%, respectively; p < .01), more 
teachers rated as exemplary by administrators 
under the state’s teacher evaluation system (23% vs. 
14%, p < .05; Massachusetts Department of Elemen- 
tary & Secondary Education, 2017), and more third 
graders scoring advanced/proficient on state stan- 
dardized tests (47% vs. 33% for ELA, p < .001; 54% 
vs. 41% for math, p < .01, respectively). Our lottery 
estimates therefore are heavily weighted toward 
applicants to schools with more advantaged, higher 
performing students than in the district overall, 
though the over-represented schools also enrolled 
majority low-income and non-White students. 
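
The over-representation thresholds above can be read as the smallest set of most-requested schools that together account for a given share of first choice lottery sample members. A minimal sketch of that reading follows. The interpretation, the function name, and the toy school labels are assumptions for illustration rather than details taken from the study's data processing.

```python
from collections import Counter


def overrepresented_schools(first_choices, share):
    """Smallest set of most-requested schools covering `share` of applicants.

    first_choices: one school label per lottery sample member
    share:         target fraction of sample members, e.g. 0.50 or 0.75
    """
    counts = Counter(first_choices)
    target = share * len(first_choices)
    selected, covered = [], 0
    # Walk schools from most to least requested until the target is reached.
    for school, n_applicants in counts.most_common():
        if covered >= target:
            break
        selected.append(school)
        covered += n_applicants
    return selected


# Hypothetical first choices for ten lottery sample members.
choices = ["A"] * 4 + ["B"] * 3 + ["C"] * 2 + ["D"]
half = overrepresented_schools(choices, 0.50)
three_quarters = overrepresented_schools(choices, 0.75)
```

In this toy example two schools cover half of the sample and three schools cover three quarters of it.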

Finally, we used a propensity-score approach to 
estimate the relationship between BPS prekinder- 
garten enrollment and our key outcomes and thus 
to examine the representativeness of our lottery- 
based estimates. Specifically, we predicted the proba- 
bility that a student would be treated conditional 
on their background characteristics, their cohort 
year, and the public school each student lived clos- 
est to as a proxy for neighborhood characteristics. 
We then inverted these propensities to obtain an 
inverse probability weight (IPW) that we could use 
in our subsequent regression analysis to counteract 
selection into the program (Imbens & Wooldridge, 
2009; Murnane & Willett, 2010). The covariates 
available for this work are the key covariates in our 
impacts work (i.e., race/ethnicity, gender, free/re- 
duced lunch, age, country of origin, and home lan- 
guage; see Table 1). The exception is that in our 
additional analysis we add a fixed effect for the 
closest public elementary school to the student, a 
proxy for neighborhood which we use because lot- 
tery blocks by definition are not available for the 
full sample. These covariates are considerably less 
rich than those in some other recent prekinder- 
garten evaluations which have been able to include 
covariates such as parent education, home literacy 
measures, Internet availability in the home, and 
number of working parents (Hill et al., 2015; Lipsey 
et al., 2015; Phillips et al., 2016). Therefore, to gauge 
whether they captured selection into the program 
and following Abdulkadiroglu et al. (2011), we
began by replicating our lottery-based findings first. 
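
The weighting logic described above can be sketched as follows. To stay self-contained, the sketch estimates each propensity as the share of enrollees within a cell of identical covariate values, which stands in for whatever propensity model is actually fit, and then compares weighted outcome means. A weighted regression of the outcome on an enrollment indicator alone would give the same difference. All names and data are hypothetical.

```python
from collections import defaultdict


def ipw_difference(rows):
    """Inverse-probability-weighted difference in mean outcomes.

    rows: iterable of (cell, enrolled, outcome), where `cell` is any
    hashable summary of a child's covariates and `enrolled` is 0 or 1.
    """
    rows = list(rows)
    n_cell = defaultdict(int)
    n_enrolled = defaultdict(int)
    for cell, enrolled, _ in rows:
        n_cell[cell] += 1
        n_enrolled[cell] += enrolled

    # For each group: [weighted outcome sum, sum of weights].
    sums = {1: [0.0, 0.0], 0: [0.0, 0.0]}
    for cell, enrolled, outcome in rows:
        p = n_enrolled[cell] / n_cell[cell]  # estimated propensity
        if p in (0.0, 1.0):
            continue  # no overlap in this cell, so it cannot be reweighted
        weight = 1 / p if enrolled else 1 / (1 - p)
        sums[enrolled][0] += weight * outcome
        sums[enrolled][1] += weight
    return sums[1][0] / sums[1][1] - sums[0][0] / sums[0][1]


# Hypothetical data: enrollment is common in cell "a" and rare in cell "b".
rows = [
    ("a", 1, 1), ("a", 1, 1), ("a", 1, 0), ("a", 0, 0),
    ("b", 1, 1), ("b", 0, 1), ("b", 0, 0), ("b", 0, 0),
]
weighted = ipw_difference(rows)
naive = 3 / 4 - 1 / 4  # unweighted difference in means
```

In this toy example the within-cell difference is two thirds in both cells. The weighted estimate recovers that value while the unweighted difference (one half) does not, because enrollment rates differ across cells.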

Specifically, we replicated the lottery-based ITT 
findings by estimating the association between 
being in the first choice lottery sample treatment 
group with grade retention, special education, and 
test scores in third grade using the sample of all 
students who applied to Boston prekindergarten 
during the four application rounds (N ~ 9,700). As 
shown in Table 5, with this replication sample, we 
found results that were very similar to our lottery 
ITT estimates, with the exceptions of a marginally 
significant and larger result on special education 
placement in kindergarten (0.36 percentage points
ITT compared with −1.68 percentage points for the
replication sample) and a marginally significant
result of similar magnitude on third-grade math
scores (−0.05 ITT and −0.06 replication). We then
used our IPW approach to estimate the association 
between BPS prekindergarten enrollment and later 
outcomes on the full sample of prekindergarten 
applicants and enrollees (N ~ 11,790), effectively 
including in the replication sample an additional 
group of students who enrolled in prekindergarten 
but did not apply through the standard process 
(see Appendix S6 for more information on available 
data in the full sample and other details on our 
IPW approach). With the full sample, we find larger 
and statistically significant associations between 
prekindergarten enrollment and grade retention 
outcomes (enrollees were 4 percentage points less 
likely to be retained in grades K-3, p < .001), special 
education placement (enrollees were 7 percentage 
points less likely to be placed in special education 
in grades K-3, p < .001), and MCAS scores (enrollees
scored 0.04 SDs higher than the average BPS third 
grader on both Math and ELA, p < .05). We view 
these findings as best interpreted as associations for 
gauging external validity and not as causal esti- 
mates; the internal validity of our IPW findings is 
bolstered somewhat by the lottery-based validation 
but ultimately, it is difficult to assess the internal 
validity of these findings. 

Taken together, our analyses gauging external 
validity point to a first choice lottery sample that 
was more advantaged than the full applicant sam- 
ple and raise caution in generalizing our lottery 
sample findings to all applicants and enrollees. 


Discussion 


While the evidence is clear that children who attend 
preschool have stronger school readiness skills at 
kindergarten entry than children who do not attend
preschool (Duncan & Magnuson, 2013; Phillips 
et al., 2017), the longer run evidence base on large- 
scale prekindergarten programs is just emerging. In 
the current study, we used a rigorous lottery-based 
approach as a window into the effects of one such 
program, the BPS prekindergarten program, on key 
child outcomes through the end of third grade. We 
also examined counterfactual care settings and K-3 
settings for first choice lottery-sample children to 
contextualize our results and we explored the gen- 
eralizability of our key results beyond the lottery 
sample. 

For special education placement, retention, and 
standardized test scores, in the first choice lottery 
sample, we found no differences in outcomes 
through third grade between first choice lottery 
winners who enrolled in BPS prekindergarten and 
control group members who did not. We did find 
evidence that Boston prekindergarten succeeded in 
drawing families into the BPS and in retaining 
them, which was one of the program’s original 
goals. Effects for compliers on enrollment and per- 
sistence in the BPS were large—about 91% of lot- 
tery winners who enrolled in Boston 
prekindergarten also enrolled in BPS kindergarten, 
versus just 67% of control group members who did 
not enroll in Boston prekindergarten. Overall, 74% 
of lottery winner compliers were enrolled in BPS 
from K-3 versus only 39% of control compliers. In 
increasingly competitive urban educational markets,
offering prekindergarten in the public schools
appears to be one avenue for attracting and retaining
families that might otherwise enroll elsewhere.

Notably, a pattern like ours, medium-term convergence
of outcomes for prekindergarten attenders
and nonattenders in the early elementary grades in
our lottery sample, is a common (though not universal)
finding in the literature (Phillips
et al., 2017). The why behind this pattern is a puzzle
and one that likely has no consistent answer across
study contexts, given the wide range in program
quality, counterfactuals, child demographics, and
elementary school quality nationally. In our context,
our results could be seen as surprising, given the
high quality of the Boston program and its promising
short-term effects on children’s school readiness
(Weiland & Yoshikawa, 2013). Several factors are
highly important in placing our results within the
broader context.

First, as we emphasize throughout the article,
our analysis is not an evaluation of the effects of
all Boston prekindergarten programs for the full
sample of children who attended. Rather, lotteries
were highly concentrated in a subset of schools;
75% of lottery applicants, for example, competed
for about a quarter of eligible district schools. There
were also important differences between children in
our first choice lottery-based sample and the full
sample, with the lottery sample generally appearing
more advantaged. On the one hand, more popular
schools might be higher quality and thus more
effective, and thus we might have expected more
persistence of impacts among this sample. However,
our generalizability work showed that students
enrolled in prekindergarten in these schools
appeared more advantaged and thus might have been
less likely to benefit from the program than their
less advantaged peers in less popular schools. In
that case, less persistence of impacts might be
expected among the lottery compliers. Supporting
this hypothesis, in our propensity score work, we
found associations suggestive of small benefits for
the full population of Boston prekindergarten
enrollees on all examined outcomes. Descriptively,
lottery sample control group compliers were also
quite high performing, scoring 0.38-0.53 SD higher
than the average BPS third grader on standardized
math and literacy tests, versus 0.15-0.17 SD for the
full sample non-enrollees. Full sample non-enrollees
were also less likely to attend other preschool
programs than lottery sample control group compliers
and more likely to persist in BPS.

Table 5

Lottery ITT Results and IPW Results for the Lottery Replication Sample and the Full Prekindergarten Applicant Sample

                                        Lottery sample (ITT)    Replication sample    Full sample
Retained in grade (%)
  Retained in kindergarten               0.04 (0.59)            −0.59 (0.37)          −2.46 (0.38)***
  Retained in first grade                1.24 (0.80)             0.58 (0.57)          −1.36 (0.45)**
  Retained in second grade              −0.03 (0.62)            −0.35 (0.45)          −0.31 (0.37)
  Ever retained                          1.22 (1.10)            −0.26 (0.77)          −3.87 (0.64)***
Special education classification (%)
  SPED in kindergarten                  −0.36 (1.26)            −1.68* (0.83)         −5.28 (0.70)***
  SPED in first grade                    0.21 (1.41)            −0.61 (0.96)          −5.92 (0.76)***
  SPED in second grade                   1.50 (1.53)             0.48 (1.06)          −5.31 (0.81)***
  SPED in third grade                    0.01 (1.66)             0.47 (1.11)          −5.78 (0.85)***
  Ever SPED                              0.25 (1.67)            −0.01 (1.12)          −6.51 (0.85)***
Third-grade test scores
  English language arts                  0.01 (0.04)            −0.01 (0.03)           0.04 (0.02)*
  Math                                  −0.05 (0.04)            −0.06* (0.03)          0.04 (0.02)*

Note. For the lottery sample (ITT), outcomes were missing data as follows: grade retention 3%-11% across variables; special education
4%-11% across variables; and test scores, 13%-16%. For the replication sample, outcomes were missing data as follows: grade retention
6%-10% across variables; special education 3%-10% across variables; and test scores, 15%. For the full sample, outcomes were missing data
as follows: grade retention 9%-14% across variables; special education 6%-14% across variables; and test scores, 20%. Statistical
significance levels are indicated as: *p < .05; **p < .01; ***p < .001. ITT = intent-to-treat; IPW = inverse probability weight.

Second, previous research has shown that the 
counterfactual matters greatly in preschool studies 
(Feller et al., 2016). In our lottery sample, a large 
majority of the lottery control group attended a 
center-based preschool program (97% ITT, 88% 
CACE; see Table 2). More typically, about a third 
to half of the control group has attended other
center-based preschool programs in large-scale causal
evaluations of publicly funded preschool programs 
(e.g., 34% in Tennessee, 48% in Tulsa, about 50% in 
Head Start; Bloom & Weiland, 2015; Hill et al., 
2015; Lipsey et al., 2015). Ours is not a test of pre- 
school versus no preschool; rather, our results indi- 
cate that compliers who attended a free public 
prekindergarten program versus largely a mix of 
other preschool programs did about equally well at 
the end of third grade. This may be because ulti- 
mately, the treatment-control contrast (Bloom & 
Weiland, 2015) may not have been large enough to 
generate lasting impacts for our lottery sample. 
Unfortunately, we lacked information on the qual- 
ity of control group care settings that would have 
allowed us to identify the full treatment-control 
contrast. Notably, in the regression-discontinuity 
(RD) evaluation of the Boston program described 
earlier in this article that found strong impacts on 
children’s school readiness skills, parents of control 
group children reported that in the year their chil- 
dren were too young to enter the Boston prekinder- 
garten program (e.g., their age 3 year), 57% 
experienced another type of center-based care and 
33% were in parental care (Weiland & Yoshikawa, 
2013), considerably higher than the national
average of 42% of 3-year-olds enrolled in preschool
programs (Whitehurst & Klein, 2015) but far fewer 
than in our lottery-based study. 

A third reason for nuanced interpretation is that 
we lack information on children’s kindergarten 
entry skills and thus were unable to identify 
whether lottery complier children had experienced 
an initial boost from Boston prekindergarten com- 
pared to control compliers. In other words, inter- 
preting our results as either surprising or expected 
in terms of persistence is complicated by not knowing
whether compliers experienced benefits from
the program in the first place. Underscoring this 
point, few of the children who participated in the 
previous RD evaluation study of Boston prekinder- 
garten that showed strong impacts on kindergarten 
readiness (Weiland & Yoshikawa, 2013) were 
included in our lottery sample. Specifically, the RD 
sample represented approximately 85% of district 
schools and 70% of eligible children in those 
schools in 2008-2009. Only 125 children were in 
both the previous study’s RD treatment group and 
the current study’s lottery winner group—constitut- 
ing about 47% of the lottery treatment group for 
the 2008-2009 school year and only 13% of the RD 
treatment group overall. 

The K-3 schooling experiences of children in our 
sample are also important to highlight to place our 
results in the context of the “sustaining
environments hypothesis,” the idea that sustaining the
boost from preschool depends on the quality of K+ 
schooling environments (Bailey et al., 2017). So far, 
the evidence on this hypothesis is mixed (Bassok, 
Gibbs, & Latham, 2018; Bierman et al., 2014; Cle- 
ments, Sarama, Wolfe, & Spitler, 2013; Jenkins 
et al., 2018; Johnson, 2013; Swain, Springer, & 
Hofer, 2015; Zhai, Raver, & Jones, 2012). We found 
that the quality of K-3 programming in Boston was 
lower on average than that of the district’s 
prekindergarten program (see Appendix S1 Table 1). 
Notably, Massachusetts and Boston do show higher 
performance relative to other states and similar dis- 
tricts nationally, respectively (National Center for 
Education Statistics, 2013; Reardon, 2017). But rela- 
tive to other districts in the state, Boston in our 
focal years had relatively weak third-grade perfor- 
mance, scoring around the bottom 11% of districts 
on the state third-grade standardized math test and 
the bottom 5% of districts for third-grade reading 
(Massachusetts Department of Elementary & Sec- 
ondary Education, 2014). Also, for our study’s 
cohort years, prekindergarten to third-grade align- 
ment reforms (see Boston Public Schools, 2017) had 
not yet taken place in the district. Prekindergarten 
attenders during our study years may have
repeated some of the same content in kindergarten, 
offering an opportunity for control compliers to 
catch up; content repetition has been associated 
with less growth in kindergarteners’ math skills in 
a nationally representative study (Engel, Claessens, 
& Finch, 2013). However, ultimately, simply know- 
ing that K-3 quality was lower than prekinder- 
garten quality and that Boston scored lower than 
most other districts does not answer the question of 
whether there is a threshold of quality needed to 
sustain effects. More nuanced measurement would 
have been required to answer questions about 
threshold effects. 

In addition to contributing to the field’s under- 
standing of medium-term convergence patterns, our 
study’s lottery-based design also has methodologi- 
cal implications for the field. As recently reviewed 
by a group of experts (Phillips et al., 2017), the 
rigor of longitudinal studies of today’s large-scale 
preschool programs thus far has been mixed. This 
is due in part to the difficulties of randomly assigning
children to a given preschool program in localities
that already have universal preschool (i.e.,
Oklahoma, West Virginia), as well as to difficulties 
not unique to preschool in gaining buy-in/agree- 
ment from local stakeholders. However, with the 
recent expansion of public preschool programs in 
contexts that, like Boston, use lottery-based assign- 
ment algorithms to assign children to preschool 
(e.g., Washington DC, Denver, San Francisco, New 
York, and New Orleans), the field is seemingly 
poised for additional rigorous studies of the impact 
of public preschool. 

Our lottery sample findings drive home the 
importance of understanding the characteristics of 
students in a city-based school lottery versus all 
students receiving the program and the lottery- 
induced treatment contrast, especially within natu- 
rally occurring randomized trials. This may be 
particularly important in contexts with prekinder- 
garten programs that, like Boston’s, are open to 
families of all income levels. Families with higher 
social capital are likely to be better at navigating 
choice and lottery systems than other families and 
may be over-represented in prekindergarten lottery 
studies similar to ours. In addition, as explained 
earlier, our study’s lottery-induced treatment contrast
amounted to comparing sample members’ first
choice Boston prekindergarten programs to other 
mostly private and other public preschool options. 
This is generally not the policy question of interest 
to policymakers seeking to expand access to preschool.
Given that most 4-year-olds now attend
some form of center-based care, future lottery studies
are likely to encounter this situation as well.
Depending on the context, future lottery studies 
may be better poised to compare different pre- 
school programs to each other than to answer the 
preschool versus none question. 

There are several limitations that should be high- 
lighted. The measures in our study were limited to 
those available via administrative records. Measures 
of other important school readiness and success
skills such as children’s socioemotional and execu- 
tive function skills were not available. Our knowl- 
edge of the program’s effects is accordingly more 
limited than we would like. Also, as explained in 
the robustness check section, one assumption 
underlying our CACE analysis—that always-takers 
in both the treatment and control groups (i.e., chil- 
dren who would have enrolled in Boston 
prekindergarten regardless of their treatment 
assignment status) experienced the same effect of 
enrollment—is difficult to evaluate. 

In closing, unpacking the preschool convergence 
phenomenon is one of the most pressing issues fac- 
ing the field of early education research (Phillips 
et al., 2017). Rigorous research on today’s programs 
is beginning to catch up to the rapid pace of pre- 
school expansion nationally, through efforts like the 
present article; efforts in North Carolina (Dodge, Bai, 
Ladd, & Muschkin, 2016), Tulsa (Hill et al., 2015; 
Phillips et al., 2017), and Tennessee (Lipsey et al., 
2018); and the five place-based teams (including in 
Boston; McCormick et al., 2019) tracking children 
from preschool to third grade in the Institute of Edu- 
cation Sciences (2016) Early Learning Network. In 
addition, the field is potentially poised for additional 
rigorous lottery-based studies that permit longitudi- 
nal analysis like in this article—though it remains to 
be seen what policy questions these studies will be 
able to answer. Our lottery-based findings, combined 
with our analysis of the relevant counterfactual and 
our quasi-experimental work on the full sample, con- 
tribute to the new generation of public preschool 
studies that will hopefully help point the way to 
ensuring a stronger, lasting boost for all children. 